AITopics | iclr 2020


fa3060edb66e6ff4507886f9912e1ab9-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-11-2026, 05:07:49 GMT

inference, revision, short run mcmc, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Review for NeurIPS paper: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding

Neural Information Processing Systems, Jan-25-2025, 22:30:49 GMT

I could not see a strong motivation for explicitly enforcing sparsity on the architecture parameters. Many existing works already try to decouple the evaluation of sub-networks from the training of the supernet (i.e., to raise the correlation between the two). In other words, there are ways to decouple network evaluation from supernet training without adding a sparsity regularization. As far as I know, weight-sharing methods require the BN statistics to be re-calculated [1] to properly measure the Kendall correlation. Other works that can reduce the gap between supernet and sub-networks (e.g.
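The Kendall correlation mentioned in the review measures how well one ranking of candidate architectures agrees with another. The following is a minimal sketch, assuming the two rankings come from supernet-estimated accuracy and stand-alone accuracy. The function name `kendall_tau` and the accuracy values are hypothetical illustrations, not taken from the reviewed paper.

```python
def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical accuracies for four candidate architectures.
supernet_acc = [0.62, 0.58, 0.71, 0.66]    # evaluated with shared supernet weights
standalone_acc = [0.93, 0.94, 0.96, 0.95]  # each architecture trained from scratch

tau = kendall_tau(supernet_acc, standalone_acc)
print(f"Kendall tau between supernet and stand-alone rankings: {tau:.3f}")
```

A value near 1 means the supernet ranks architectures in almost the same order as stand-alone training, which is the property the decoupling methods in the review aim for.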

consistent neural architecture search, neurips paper, sparse coding, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.46)
Information Technology > Artificial Intelligence > Cognitive Science (0.46)


Sharing Knowledge in Multi-Task Deep Reinforcement Learning

D'Eramo, Carlo, Tateo, Davide, Bonarini, Andrea, Restelli, Marcello, Peters, Jan

arXiv.org Artificial Intelligence, Jan-17-2024

We study the benefit of sharing representations among tasks to enable the effective use of deep neural networks in Multi-Task Reinforcement Learning. We leverage the assumption that learning from different tasks that share common properties helps to generalize the knowledge of them, resulting in more effective feature extraction than learning a single task. Intuitively, the resulting set of features offers performance benefits when used by Reinforcement Learning algorithms. We prove this by providing theoretical guarantees that highlight the conditions under which it is convenient to share representations among tasks, extending the well-known finite-time bounds of Approximate Value-Iteration to the multi-task setting. In addition, we complement our analysis by proposing multi-task extensions of three Reinforcement Learning algorithms, which we empirically evaluate on widely used Reinforcement Learning benchmarks, showing significant improvements over the single-task counterparts in terms of sample efficiency and performance. Multi-Task Learning (MTL) ambitiously aims to learn multiple tasks jointly instead of learning them separately, leveraging the assumption that the considered tasks have common properties which can be exploited by Machine Learning (ML) models to generalize the learning of each of them. For instance, the features extracted in the hidden layers of a neural network trained on multiple tasks have the advantage of being a general representation of structures common to each other.

avg, conference paper, representation, (15 more...)

arXiv.org Artificial Intelligence

2401.09561

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)


Concrete Problems in AI Safety, Revisited

Raji, Inioluwa Deborah, Dobbe, Roel

arXiv.org Artificial Intelligence, Dec-18-2023

As AI systems proliferate in society, the AI community is increasingly preoccupied with the concept of AI Safety, namely the prevention of failures due to accidents that arise from an unanticipated departure of a system's behavior from designer intent in AI deployment. We demonstrate through an analysis of real world cases of such incidents that although current vocabulary captures a range of the encountered issues of AI deployment, an expanded socio-technical framing will be required for a more complete understanding of how AI systems and implemented safety mechanisms fail and succeed in real life. The rapid adoption and widespread experimentation and deployment of AI systems has triggered a variety of failures. Some are catastrophic and visible such as in the case of fatal crashes involving autonomous vehicles. Other failures are much more subtle and pernicious, such as the development of new forms of addiction to personalized content.

ai system, arxiv preprint arxiv, vehicle, (12 more...)

arXiv.org Artificial Intelligence

2401.10899

Country:

North America > United States > Texas (0.05)
North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
(5 more...)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Security & Privacy (1.00)
Law (0.95)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)


Principled Weight Initialization for Hypernetworks

Chang, Oscar, Flokas, Lampros, Lipson, Hod

arXiv.org Artificial Intelligence, Dec-12-2023

Hypernetworks are meta neural networks that generate weights for a main neural network in an end-to-end differentiable manner. Despite extensive applications ranging from multi-task learning to Bayesian deep learning, the problem of optimizing hypernetworks has not been studied to date. We observe that classical weight initialization methods like Glorot & Bengio (2010) and He et al. (2015), when applied directly on a hypernet, fail to produce weights for the mainnet in the correct scale. We develop principled techniques for weight initialization in hypernets, and show that they lead to more stable mainnet weights, lower training loss, and faster convergence. Meta-learning describes a broad family of techniques in machine learning that deals with the problem of learning to learn. An emerging branch of meta-learning involves the use of hypernetworks, which are meta neural networks that generate the weights of a main neural network to solve a given task in an end-to-end differentiable manner. Hypernetworks were originally introduced by Ha et al. (2016) as a way to induce weight-sharing and achieve model compression by training the same meta network to learn the weights belonging to different layers in the main network.
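The scale mismatch described in this abstract can be illustrated numerically. The sketch below is not the authors' initialization method. It assumes a hypernet made of a single bias-free linear layer with Glorot-normal weights, a fixed embedding with entries of ±1, and a He-style target variance of 2 / fan_in for the mainnet layer. All names and sizes are hypothetical.

```python
import math
import random


def glorot_std(fan_in, fan_out):
    """Standard deviation used by Glorot (Xavier) normal initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def generated_weight_variance(embed_dim, main_fan_in, main_fan_out, seed=0):
    """Variance of mainnet weights emitted by a one-layer linear hypernet.

    The hypernet maps an embedding of size embed_dim to all
    main_fan_in * main_fan_out weights of one mainnet layer.
    """
    rng = random.Random(seed)
    n_main = main_fan_in * main_fan_out
    # Glorot sees the hypernet's own fan-in/fan-out, not the mainnet's.
    std = glorot_std(embed_dim, n_main)
    # Assumed embedding: entries of +1 or -1, so each has unit second moment.
    embedding = [rng.choice((-1.0, 1.0)) for _ in range(embed_dim)]
    weights = []
    for _ in range(n_main):
        row = [rng.gauss(0.0, std) for _ in range(embed_dim)]
        weights.append(sum(h * e for h, e in zip(row, embedding)))
    mean = sum(weights) / n_main
    return sum((w - mean) ** 2 for w in weights) / n_main


embed_dim, main_fan_in, main_fan_out = 64, 256, 16
observed = generated_weight_variance(embed_dim, main_fan_in, main_fan_out)
he_target = 2.0 / main_fan_in  # variance He init would give the mainnet layer
ratio = observed / he_target
print(f"generated variance: {observed:.5f}")
print(f"He target variance: {he_target:.5f}")
print(f"ratio: {ratio:.2f}")
```

Under these assumptions the generated variance is roughly 2 * embed_dim / (embed_dim + main_fan_in * main_fan_out), which depends on the hypernet's dimensions rather than on the mainnet layer's fan-in, so the two values generally do not match.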

hypernetwork, var, xavier, (11 more...)

arXiv.org Artificial Intelligence

2312.08399

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
Africa > Kenya (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)


Accelerating Meta-Learning by Sharing Gradients

Chang, Oscar, Lipson, Hod

arXiv.org Artificial Intelligence, Dec-12-2023

The success of gradient-based meta-learning is primarily attributed to its ability to leverage related tasks to learn task-invariant information. However, the absence of interactions between different tasks in the inner loop leads to task-specific over-fitting in the initial phase of meta-training. While this is eventually corrected by the presence of these interactions in the outer loop, it comes at a significant cost of slower meta-learning. To address this limitation, we explicitly encode task relatedness via an inner loop regularization mechanism inspired by multi-task learning. Our algorithm shares gradient information from previously encountered tasks as well as concurrent tasks in the same task batch, and scales their contribution with meta-learned parameters. We show using two popular few-shot classification datasets that gradient sharing enables meta-learning under bigger inner loop learning rates and can accelerate the meta-training process by up to 134%.

accuracy, denote, gradient, (16 more...)

arXiv.org Artificial Intelligence

2312.08398

Country: North America > United States > California (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


Spanish Pre-trained BERT Model and Evaluation Data

Cañete, José, Chaperon, Gabriel, Fuentes, Rodrigo, Ho, Jou-Hui, Kang, Hojin, Pérez, Jorge

arXiv.org Artificial Intelligence, Aug-5-2023

The Spanish language is one of the top 5 spoken languages in the world. Nevertheless, finding resources to train or evaluate Spanish language models is not an easy task. In this paper we help bridge this gap by presenting a BERT-based language model pre-trained exclusively on Spanish data. As a second contribution, we also compiled several tasks specifically for the Spanish language in a single repository much in the spirit of the GLUE benchmark. By fine-tuning our pretrained Spanish model, we obtain better results compared to other BERT-based models pre-trained on multilingual corpora for most of the tasks, even achieving a new state-of-the-art on some of them. We have publicly released our model, the pre-training data, and the compilation of the Spanish benchmarks. The field of natural language processing (NLP) has made incredible progress in the last two years.

arxiv preprint arxiv, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2308.02976

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Chile (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)


Improved Image Wasserstein Attacks and Defenses

Hu, Edward J., Swaminathan, Adith, Salman, Hadi, Yang, Greg

arXiv.org Artificial Intelligence, May-9-2023

A recently proposed Wasserstein distance-bounded threat model is a promising alternative that limits the perturbation to pixel mass movements. We point out and rectify flaws in the previous definition of the Wasserstein threat model and explore stronger attacks and defenses under our better-defined framework. Lastly, we discuss the inability of current Wasserstein-robust models in defending against perturbations seen in the real world. We will release our code and trained models upon publication. Deep learning approaches to computer vision tasks, such as image classification, are not robust. For example, a data point that is classified correctly can be modified in a nearly imperceptible way to cause the classifier to misclassify it (Szegedy et al., 2013; Goodfellow et al., 2015).

artificial intelligence, machine learning, threat model, (19 more...)

arXiv.org Artificial Intelligence

2004.12478

Country: North America > United States > Washington > King County > Redmond (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)


What can human minimal videos tell us about dynamic recognition models?

Ben-Yosef, Guy, Kreiman, Gabriel, Ullman, Shimon

arXiv.org Artificial Intelligence, Apr-19-2021

Published as a workshop paper at "Bridging AI and Cognitive Science" (ICLR 2020). In human vision, objects and their parts can be visually recognized from purely spatial or purely temporal information, but the mechanisms integrating space and time are poorly understood. Here we show that human visual recognition of objects and actions can be achieved by efficiently combining spatial and motion cues in configurations where each source on its own is insufficient for recognition. This analysis is obtained by identifying minimal videos: these are short and tiny video clips in which objects, parts, and actions can be reliably recognized, but any reduction in either space or time makes them unrecognizable. State-of-the-art deep networks for dynamic visual recognition cannot replicate human behavior in these configurations. This gap between humans and machines points to critical mechanisms in human dynamic vision that are lacking in current models.

configuration, minimal video, video, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.cognition.2020.104263

2104.09447

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
